AITopics | key-value store


key-value store


Towards Efficient Key-Value Cache Management for Prefix Prefilling in LLM Inference

Zhu, Yue, Yu, Hao, Wang, Chen, Liu, Zhuoran, Lee, Eun Kyung

arXiv.org Artificial Intelligence, May-29-2025

The increasing adoption of large language models (LLMs) with extended context windows necessitates efficient Key-Value Cache (KVC) management to optimize inference performance. We analyze real-world KVC access patterns using publicly available traces and evaluate commercial key-value stores like Redis and state-of-the-art RDMA-based systems (CHIME [1] and Sherman [2]) for KVC metadata management. Our work demonstrates the lack of a tailored storage solution for KVC prefilling, underscores the need for an efficient distributed caching system with optimized metadata management for LLM workloads, and provides insights into designing improved KVC management systems for scalable, low-latency inference. Large Language Models (LLMs) have shown remarkable ability in tasks like text generation, translation, and question-answering, but their attention architecture introduces significant challenges. The use of key-value caches (KVC) in the attention layers of transformer models, while essential for efficient token generation, requires substantial memory resources.
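To make the memory claim concrete, the back-of-the-envelope sketch below estimates KV cache size from the usual rule that each token stores one key vector and one value vector per layer and per KV head. The function name and the example configuration (32 layers, 32 KV heads, head dimension 128, 16-bit elements) are illustrative assumptions, not figures from the paper.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Approximate KV cache size in bytes.

    The leading factor of 2 accounts for storing both a key and a value
    vector for every token, in every layer, for every KV head.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Hypothetical model configuration (an assumption for illustration only).
size = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128, seq_len=4096)
print(f"{size / 2**30:.1f} GiB")  # prints "2.0 GiB"
```

Because the size grows linearly with sequence length and batch size, long-context workloads can quickly push the cache beyond a single accelerator's memory, which is the pressure that motivates external KVC storage.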

large language model, natural language, workload, (17 more...)


2505.21919

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)


Learning to Optimize LSM-trees: Towards A Reinforcement Learning based Key-Value Store for Dynamic Workloads

Mo, Dingheng, Chen, Fanchao, Luo, Siqiang, Shan, Caihua

arXiv.org Artificial Intelligence, Sep-17-2023

LSM-trees are widely adopted as the storage backend of key-value stores. However, optimizing the system performance under dynamic workloads has not been sufficiently studied or evaluated in previous work. To fill the gap, we present RusKey, a key-value store with the following new features: (1) RusKey is a first attempt to orchestrate LSM-tree structures online to enable robust performance under the context of dynamic workloads; (2) RusKey is the first study to use Reinforcement Learning (RL) to guide LSM-tree transformations; (3) RusKey includes a new LSM-tree design, named FLSM-tree, for an efficient transition between different compaction policies -- the bottleneck of dynamic key-value stores. We justify the superiority of the new design with theoretical analysis; (4) RusKey requires no prior workload knowledge for system adjustment, in contrast to state-of-the-art techniques. Experiments show that RusKey exhibits strong performance robustness in diverse workloads, achieving up to 4x better end-to-end performance than the RocksDB system under various settings.
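For readers unfamiliar with the structure being tuned, here is a deliberately tiny LSM-tree sketch: writes go to an in-memory memtable, full memtables are flushed as sorted runs, and compaction merges runs. It is not RusKey or its FLSM-tree design; the class `ToyLSM` and its parameters `memtable_limit` and `max_runs` are hypothetical names chosen for illustration, and the single-level full merge is a simplification.

```python
import bisect


class ToyLSM:
    """Minimal single-level LSM-tree: memtable plus a list of sorted runs."""

    def __init__(self, memtable_limit=4, max_runs=3):
        self.memtable = {}            # most recent writes, unsorted
        self.runs = []                # sorted lists of (key, value), newest last
        self.memtable_limit = memtable_limit
        self.max_runs = max_runs      # compaction knob (illustrative)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search runs from newest to oldest so the latest write wins.
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def _flush(self):
        # Persist the memtable as an immutable sorted run.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}
        if len(self.runs) > self.max_runs:
            self._compact()

    def _compact(self):
        # Merge all runs into one; later (newer) runs overwrite older entries.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())]


db = ToyLSM(memtable_limit=2, max_runs=2)
for i in range(10):
    db.put(f"k{i}", i)
db.put("k3", 99)
print(db.get("k3"), db.get("k7"), db.get("missing"))  # prints "99 7 None"
```

In this toy, a larger `max_runs` defers merging, which makes writes cheaper but forces reads to probe more runs, while a small value merges eagerly. That trade-off loosely mirrors the choice between compaction policies that the abstract describes adjusting online.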

compaction policy, transition, workload, (15 more...)


2308.07013

Country:

Europe > Austria > Vienna (0.14)
Asia > Singapore (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Services (0.67)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


Building a Scalable ML Feature Store with Redis

#artificialintelligence, Mar-31-2021, 02:00:28 GMT

When a company with millions of consumers such as DoorDash builds machine learning (ML) models, the amount of feature data can grow to billions of records with millions actively retrieved during model inference under low latency constraints. These challenges warrant a deeper look into selection and design of a feature store -- the system responsible for storing and serving feature data. The decisions made here can prevent overrunning cost budgets, compromising runtime performance during model inference, and curbing model deployment velocity. Features are the input variables fed to an ML model for inference. A feature store, simply put, is a key-value store that makes this feature data available to models in production. At DoorDash, our existing feature store was built on top of Redis, but had a lot of inefficiencies and came close to running out of capacity. We ran a full-fledged benchmark evaluation on five different key-value stores to compare their cost and performance metrics.
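The idea of a feature store as a key-value store can be sketched in a few lines. The example below uses a plain Python dict as a stand-in for Redis, and the `entity_type:entity_id:feature_name` key layout is an assumption for illustration, not DoorDash's actual schema.

```python
class FeatureStore:
    """Toy feature store backed by an in-memory dict (stand-in for a real KV store)."""

    def __init__(self):
        self._kv = {}

    @staticmethod
    def _key(entity_type, entity_id, feature_name):
        # Hypothetical key schema: one flat key per (entity, feature) pair.
        return f"{entity_type}:{entity_id}:{feature_name}"

    def put(self, entity_type, entity_id, features):
        """Write a dict of feature_name -> value for one entity."""
        for name, value in features.items():
            self._kv[self._key(entity_type, entity_id, name)] = value

    def get_many(self, entity_type, entity_id, feature_names, default=None):
        """Fetch several features for one entity, as done at inference time."""
        return [
            self._kv.get(self._key(entity_type, entity_id, name), default)
            for name in feature_names
        ]


store = FeatureStore()
store.put("consumer", 42, {"orders_7d": 3, "avg_basket": 27.5})
print(store.get_many("consumer", 42, ["orders_7d", "avg_basket", "missing"], default=0.0))
# prints [3, 27.5, 0.0]
```

With billions of records, the key layout matters: one flat key per feature repeats the entity prefix in every key, whereas grouping an entity's features under one key trades that overhead for coarser reads and writes.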

feature store, key-value store, redis, (16 more...)


Industry:

Information Technology > Services (0.83)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.83)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.89)


Interleaved Sequence RNNs for Fraud Detection

Branco, Bernardo, Abreu, Pedro, Gomes, Ana Sofia, Almeida, Mariana S. C., Ascensão, João Tiago, Bizarro, Pedro

arXiv.org Machine Learning, Feb-14-2020

Payment card fraud causes multibillion dollar losses for banks and merchants worldwide, often fueling complex criminal activities. To address this, many real-time fraud detection systems use tree-based models, demanding complex feature engineering systems to efficiently enrich transactions with historical data while complying with millisecond-level latencies. In this work, we do not require those expensive features by using recurrent neural networks and treating payments as an interleaved sequence, where the history of each card is an unbounded, irregular sub-sequence. We present a complete RNN framework to detect fraud in real-time, proposing an efficient ML pipeline from preprocessing to deployment. We show that these feature-free, multi-sequence RNNs outperform state-of-the-art models saving millions of dollars in fraud detection and using fewer computational resources.

dataset, sequence, transaction, (13 more...)


2002.05988

Country:

North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.70)

Industry: Law Enforcement & Public Safety > Fraud (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)


Towards Better Interpretability in Deep Q-Networks

Annasamy, Raghuram Mandyam, Sycara, Katia

arXiv.org Machine Learning, Sep-14-2018

Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. As improvements in training algorithms continue at a brisk pace, theoretical and empirical studies of what these networks learn lag far behind. In this paper we propose an interpretable neural network architecture for Q-learning which provides a global explanation of the model's behavior using key-value memories, attention and reconstructible embeddings. With a directed exploration strategy, our model can reach training rewards comparable to the state-of-the-art deep Q-learning models. However, results suggest that the features extracted by the neural network are extremely shallow, and subsequent testing using out-of-sample examples shows that the agent can easily overfit to trajectories seen during training.
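As a rough illustration of what reading from a key-value memory with attention means, the sketch below scores a query against each key by dot product, normalizes the scores with a softmax, and returns the weighted sum of the values. This is generic soft attention in pure Python, not the paper's exact architecture; the function name `attend` and the toy vectors are assumptions.

```python
import math


def attend(query, keys, values):
    """Soft read from a key-value memory.

    Returns (weights, read), where weights is a softmax over query-key
    dot products and read is the weight-averaged value vector.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)                        # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    read = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, read


# Two memory slots; the query matches the first key far better than the second.
weights, read = attend(
    query=[1.0, 0.0],
    keys=[[10.0, 0.0], [0.0, 10.0]],
    values=[[1.0], [0.0]],
)
print(weights, read)
```

Because the attention weights show which memory slots a decision drew on, inspecting them is one way such a model can offer an explanation of its behavior.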

artificial intelligence, machine learning, reinforcement learning, (16 more...)


1809.0563

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Deploying Deep Ranking Models for Search Verticals

Ramanath, Rohan, Polatkan, Gungor, Xu, Liqin, Lee, Harold, Hu, Bo, Zhou, Shan

arXiv.org Artificial Intelligence, Jun-6-2018

In this paper, we present an architecture for executing a complex machine learning model, such as a neural network that captures semantic similarity between a query and a document, and deploy it to a real-world production system serving 500M+ users. We present the challenges that arise in a real-world system and how we solve them. We demonstrate that our architecture provides competitive modeling capability without any significant performance impact to the system in terms of latency. Our modular solution and insights can be used by other real-world search systems to realize and productionize recent gains in neural networks.

artificial intelligence, machine learning, natural language, (20 more...)


1806.02281

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)


NoSQL Data Architecture & Data Governance: Everything You Need to Know - DATAVERSITY

@machinelearnbot, Feb-19-2018, 14:21:10 GMT

Author: Akshay Pore. NoSQL databases have seen increased adoption in the last five years. A growing number of companies have at least one NoSQL database as a part of their enterprise data landscape. In today's IT environment, where DevOps, SAFe and Agile are being increasingly embraced, using a NoSQL database for application development is seen as a huge advantage in speeding up time to market for software products. Developers have been quick to adopt NoSQL databases due to their flexible schema, efficient processing and storage of unstructured and semi-structured data, and ability to support high-performance queries in a scale-out environment.

artificial intelligence, database, information management, (15 more...)


Technology:

Information Technology > Software Engineering (0.73)
Information Technology > Information Management (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.30)


Top NoSQL Database Engines

@machinelearnbot, May-2-2017, 15:30:07 GMT

I am not a fan of the term NoSQL. Many others are, however, and it has become a permanent part of the collective data storage nomenclature, meant to describe schema-less, non-relational data storage schemes. NoSQL is an umbrella term, one which encompasses a number of different technologies. These different technologies aren't even necessarily related in any way beyond the single defining characteristic of NoSQL: they are not relational in nature; for right or wrong, Structured Query Language (SQL) has become conflated with relational database management systems over the years. So, while I am not personally a fan of the term NoSQL, I can appreciate why others are, given that it quickly implies what it is we are talking about by explicitly stating what we are not talking about.

artificial intelligence, database, natural language, (13 more...)


Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.36)
Information Technology > Communications > Social Media (0.34)


Model-Parallel Inference for Big Topic Models

Zheng, Xun, Kim, Jin Kyu, Ho, Qirong, Xing, Eric P.

arXiv.org Machine Learning, Nov-9-2014

In real-world industrial applications of topic modeling, the ability to capture a gigantic conceptual space by learning an ultra-high-dimensional topical representation, i.e., the so-called "big model", is becoming the next desideratum after the enthusiasm for "big data", especially for fine-grained downstream tasks such as online advertising, where good performance is usually achieved by regression-based predictors built on millions if not billions of input features. The conventional data-parallel approach for training gigantic topic models turns out to be rather inefficient in utilizing the power of parallelism, due to its heavy dependency on a centralized image of the "model". Big model size also poses a storage challenge, since the feasible model size is bounded by the smallest RAM among the nodes. To address these issues, we explore another type of parallelism, namely model-parallelism, which enables training of disjoint blocks of a big topic model in parallel. By integrating data-parallelism with model-parallelism, we show that dependencies between distributed elements can be handled seamlessly, achieving not only faster convergence but also an ability to tackle significantly bigger model sizes. We describe an architecture for model-parallel inference of LDA, and present a variant of the collapsed Gibbs sampling algorithm tailored for it. Experimental results demonstrate the ability of this system to handle topic modeling with an unprecedented 200 billion model variables on only a low-end cluster with very limited computational resources and bandwidth.

artificial intelligence, machine learning, natural language, (19 more...)


1411.2305

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
